Causal Discovery from Databases with Discrete and Continuous Variables
نویسندگان
چکیده
Bayesian Constraint-based Causal Discovery (BCCD) is a state-of-the-art method for robust causal discovery in the presence of latent variables. It combines probabilistic estimation of Bayesian networks over subsets of variables with a causal logic to infer causal statements. Currently BCCD is limited to discrete or Gaussian variables. Most of the real-world data, however, contain a mixture of discrete and continuous variables. We here extend BCCD to be able to handle combinations of discrete and continuous variables, under the assumption that the relations between the variables are monotonic. To this end, we propose a novel method for the efficient computation of BIC scores for hybrid Bayesian networks. We demonstrate the accuracy and efficiency of our approach for causal discovery on simulated data as well as on real-world data from the ADHD-200 competition.
منابع مشابه
Causal discovery from databases with discrete and continuous variables1
Causal discovery is widely used for analysis of experimental data focusing on the exploratory analysis and suggesting probable causal dependencies. There is a variety of causal discovery algorithms in the literature. Some of these algorithms rely on the assumption that there are no latent variables in the model; others do not provide a scoring metric to easily compare the reliability of two can...
متن کاملCombining Linear Non-Gaussian Acyclic Model with Logistic Regression Model for Estimating Causal Structure from Mixed Continuous and Discrete Data
Estimating causal models from observational data is a crucial task in data analysis. For continuousvalued data, Shimizu et al. have proposed a linear acyclic non-Gaussian model to understand the data generating process, and have shown that their model is identifiable when the number of data is sufficiently large. However, situations in which continuous and discrete variables coexist in the same...
متن کاملThe Discovery of Generalized Causal Models with Mixed Variables Using MML Criterion
One major difficulty frustrating the application of linear causal models is that they are not easily adapted to cope with discrete data. This is unfortunate since most real problems involve both continuous and discrete variables. In this paper, we consider a class of graphical models which allow both continuous and discrete variables, and propose the parameter estimation method and a structure ...
متن کاملContinuous Discrete Variable Optimization of Structures Using Approximation Methods
Optimum design of structures is achieved while the design variables are continuous and discrete. To reduce the computational work involved in the optimization process, all the functions that are expensive to evaluate, are approximated. To approximate these functions, a semi quadratic function is employed. Only the diagonal terms of the Hessian matrix are used and these elements are estimated fr...
متن کاملA Comparison of Association Rule Discovery and Bayesian Network Causal Inference Algorithms to Discover Relationships in Discrete Data
Association rules discovered through attribute-oriented induction are commonly used in data mining tools to express relationships between variables. However, causal inference algorithms discover more concise relationships between variables, namely, relations of direct cause. These algorithms produce regressive structured equation models for continuous linear data and Bayes networks for discrete...
متن کامل